Online Unsupervised Multilingual Acoustic Model Adaptation for Nonnative ASR

Authors

  • Sethserey Sam
  • Eric Castelli
  • Laurent Besacier
Abstract

Automatic speech recognition (ASR) is currently one of the main research interests in computer science, and many ASR systems are available on the market. Yet the performance of speech and language recognition systems is poor on nonnative speech. The challenge for nonnative speech recognition is to maximize the accuracy of a speech recognition system when only a small amount of nonnative data is available. Recent studies on nonnative speech recognition have focused on the supervised setting, in which the spoken language (L2) and the speaker’s mother tongue (L1) are known in advance. In this paper, we study an adaptation approach for nonnative speech in which both L1 and L2 are unknown in advance. We call this approach online unsupervised multilingual acoustic model adaptation. Here, “unsupervised” means that the L1 and L2 of the nonnative speech utterance are not known in advance, and “online” means that the adaptation is performed during decoding. The proposed approach consists of two stages. The first stage, which contains a language observer module, aims to recover the linguistic information (the spoken language and the origin of the speaker) of the unknown speech utterances to be decoded. The second stage adapts the multilingual acoustic model based on the knowledge provided by the language observer module. The multilingual acoustic model must therefore contain the acoustic units of both L2 and L1. In this study, we report on acoustic model adaptation for improving the recognition of nonnative speech in English, French and Vietnamese, spoken by speakers of different origins. In the experiments, the baseline systems’ phone error rates (PERs) were reduced by around 7%, which demonstrates the feasibility of the method.
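The two-stage flow described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names `observe_languages` and `adapt_means`, the use of per-language scores for the language observer, the phone mapping, and the choice of linear interpolation of scalar model means as the adaptation step.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LanguageGuess:
    l2: str  # language being spoken
    l1: str  # speaker's mother tongue


def observe_languages(l2_scores: Dict[str, float],
                      l1_scores: Dict[str, float]) -> LanguageGuess:
    """Stage 1 (hypothetical language observer): pick the best-scoring
    L2 and L1 from per-language scores, e.g. log-likelihoods."""
    return LanguageGuess(l2=max(l2_scores, key=l2_scores.get),
                         l1=max(l1_scores, key=l1_scores.get))


def adapt_means(model: Dict[str, Dict[str, float]],
                guess: LanguageGuess,
                phone_map: Dict[str, str],
                weight: float = 0.5) -> Dict[str, float]:
    """Stage 2 (assumed technique): move each L2 phone mean toward the
    mean of its mapped L1 phone by linear interpolation. Phones without
    a usable L1 counterpart are left unchanged."""
    l1_means = model[guess.l1]
    adapted = {}
    for phone, mean in model[guess.l2].items():
        l1_phone = phone_map.get(phone)
        if l1_phone in l1_means:
            adapted[phone] = (1.0 - weight) * mean + weight * l1_means[l1_phone]
        else:
            adapted[phone] = mean
    return adapted


# Toy data: one scalar "mean" per phone in a multilingual model.
model = {"en": {"ae": 1.0, "th": 3.0}, "vi": {"a": 2.0}}
guess = observe_languages({"en": -10.0, "fr": -12.5, "vi": -15.0},
                          {"en": -14.0, "fr": -11.0, "vi": -9.5})
adapted = adapt_means(model, guess, phone_map={"ae": "a"}, weight=0.25)
print(guess, adapted)
```

The sketch also shows why the multilingual model must contain the acoustic units of both languages: the adaptation step reads the L2 and the L1 entries of the same model. In a real system the means would be vectors inside HMM states rather than single floats.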


Similar resources

Unsupervised acoustic model adaptation for multi-origin non native ASR

To date, the performance of speech and language recognition systems is poor on non-native speech. The challenge for nonnative speech recognition is to maximize the accuracy of a speech recognition system when only a small amount of nonnative data is available. We report on the acoustic model adaptation for improving the recognition of non-native speech in English, French and Vietnamese, spoken ...


Rapid Building of an ASR System for Under-Resourced Languages Based on Multilingual Unsupervised Training

This paper presents our work on rapid language adaptation of acoustic models based on multilingual cross-language bootstrapping and unsupervised training. We used Automatic Speech Recognition (ASR) systems in the six source languages English, French, German, Spanish, Bulgarian and Polish to build from scratch an ASR system for Vietnamese, an under-resourced language. System building was performe...


Speech alignment and recognition experiments for Luxembourgish

Luxembourgish, embedded in a multilingual context on the divide between Romance and Germanic cultures, remains one of Europe’s under-described languages. In this paper, we propose to study acoustic similarities between Luxembourgish and major contact languages (German, French, English) with the help of automatic speech alignment and recognition systems. Experiments were run using monolingual ac...


Multilingual Pronunciat Improving Multilingual S

Multilinguality aspects are becoming increasingly important in Automatic Speech Recognition (ASR) systems. It is apparent that coping with the large variability of the speech signal is an even bigger challenge in multilingual ASR systems than it has been in conventional monolingual systems. In this paper, we address the importance of combining multilingual pronunciation modeling and acoustic mo...


On-line learning of acoustic and lexical units for domain-independent ASR

We are interested in on-line acquisition of acoustic, lexical and semantic units from spontaneous speech. Traditional ASR techniques require the domain-specific knowledge of acoustic, lexicon data and more importantly the word probability distributions. In this paper we propose an algorithm for unsupervised learning of acoustic and lexical units from out-of-domain speech data. The new lexical un...




Publication date 2012